Efficient Exploration in Reinforcement Learning Based on Utile Suffix Memory

Author

  • Arthur Pchelkin
Abstract

Reinforcement learning addresses the question of how an autonomous agent can learn to choose optimal actions to achieve its goals. Efficient exploration is of fundamental importance for autonomous agents that learn to act. Previous approaches to exploration in reinforcement learning usually address the case in which the environment is fully observable. In contrast, we study the case in which the environment is only partially observable. We consider different exploration techniques applied to the learning algorithm "Utile Suffix Memory" and, in addition, discuss an adaptive fringe depth. Experimental results in a partially observable maze show that the choice of exploration technique has a serious impact on the performance of the learning algorithm.
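
To give a concrete feel for the kind of exploration techniques the abstract refers to, the sketch below shows two standard action-selection rules, epsilon-greedy and Boltzmann (softmax) exploration, applied to Q-values stored at a leaf of a suffix-memory state representation. This is a minimal, hypothetical illustration only; the leaf structure, parameter values, and names are assumptions, not the paper's actual implementation or results.

    import math
    import random

    def epsilon_greedy(q_values, epsilon=0.1):
        """With probability epsilon pick a random action, otherwise the greedy one.

        q_values: dict mapping action -> estimated Q-value at the current
        suffix-memory leaf (hypothetical structure, for illustration only).
        """
        actions = list(q_values)
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=q_values.get)

    def boltzmann(q_values, temperature=0.5):
        """Sample an action with probability proportional to exp(Q / temperature)."""
        actions = list(q_values)
        prefs = [math.exp(q_values[a] / temperature) for a in actions]
        total = sum(prefs)
        return random.choices(actions, weights=[p / total for p in prefs])[0]

    # Example: Q-values at one (hypothetical) suffix-tree leaf of the agent.
    leaf_q = {"north": 0.42, "south": 0.10, "east": 0.37, "west": -0.05}
    print(epsilon_greedy(leaf_q))
    print(boltzmann(leaf_q))

Lower temperatures or smaller epsilon values make the agent greedier; the paper's contribution concerns how such choices interact with the partially observable setting and the suffix-memory representation.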

Similar articles

Efficient Exploration in Reinforcement Learning Based on Short-term Memory

Reinforcement learning addresses the question of how an autonomous agent that senses and acts in its environment can learn to choose optimal actions to achieve its goals. It is related to the problem of learning control strategies. In practice multiple situations are usually indistinguishable from immediate perceptual input. These multiple situations may require different responses from the age...

Instance-Based Utile Distinctions for Reinforcement Learning with Hidden State

We present Utile Suffix Memory, a reinforcement learning algorithm that uses short-term memory to overcome the state aliasing that results from hidden state. By combining the advantages of previous work in instance-based (or “memorybased”) learning and previous work with statistical tests for separating noise from task structure, the method learns quickly, creates only as much memory as needed ...
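
As a rough illustration of the suffix-memory idea described above, the sketch below shows a suffix tree indexed by the agent's recent action-observation history, with a statistical test (a two-sample Kolmogorov-Smirnov test is commonly used for this purpose) deciding whether a finer history distinction is "utile". All names and the data layout are illustrative assumptions, not the published implementation.

    from dataclasses import dataclass, field
    from scipy.stats import ks_2samp

    @dataclass
    class SuffixNode:
        # Children keyed by the next-older (action, observation) pair in the history.
        children: dict = field(default_factory=dict)
        # Discounted future returns of the stored instances that fall into this node.
        returns: list = field(default_factory=list)

    def leaf_for(history, root):
        """Walk the suffix tree from the most recent experience backwards."""
        node = root
        for step in reversed(history):
            if step not in node.children:
                return node
            node = node.children[step]
        return node

    def distinction_is_utile(parent, fringe_child, alpha=0.05):
        """Keep a deeper distinction only if its return distribution differs from the parent's."""
        if len(parent.returns) < 2 or len(fringe_child.returns) < 2:
            return False
        result = ks_2samp(parent.returns, fringe_child.returns)
        return result.pvalue < alpha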

Resolving Perceptual Aliasing In The Presence Of Noisy Sensors

Agents learning to act in a partially observable domain may need to overcome the problem of perceptual aliasing – i.e., different states that appear similar but require different responses. This problem is exacerbated when the agent’s sensors are noisy, i.e., sensors may produce different observations in the same state. We show that many well-known reinforcement learning methods designed to dea...

Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks

This paper presents U-Tree, a reinforcement learning algorithm that uses selective attention and shortterm memory to simultaneously address the intertwined problems of large perceptual state spaces and hidden state. By combining the advantages of work in instance-based (or “memory-based”) learning and work with robust statistical tests for separating noise from task structure, the method learns...

Learning Task-Relevant State Spaces with a Utile Distinction Test

This paper presents a reinforcement learning algorithm that learns an agent-internal state space on-line, in response to the demands of the task—thus avoiding the need for the agent designer to delicately engineer the agent's internal state space. The algorithm scales well with (1) large perceptual state spaces by pruning away unnecessary features, and (2) “overly small” perceptual state spaces...

Journal:
  • Informatica, Lith. Acad. Sci.

Volume 14, Issue 

Pages  -

Publication date: 2003